game 3
37ecd27608480aa3569a511a638ca74f-Supplemental.pdf
Tables 3 and 4 summarize hyperparameters for PATE-FM and ALIBI, respectively. Table 3: PATE-FM (Algorithms 1 and 2) hyperparameters for select accuracy levels. By repeating this game multiple times, we can estimate the adversary's success rate and convert this ... The probability is taken over the bit b, the randomness of the mechanism M and the algorithm A. Theorem B.1. It now remains to be seen how we can bound the adversary's correct guessing rate ... "canaries", we can compute a lower bound on the adversary's ... We can improve the tightness of this bound further. The adversary simply looks at the model's confidence on ... (Game 3).
Self-Supervised Vision-Based Detection of the Active Speaker as a Prerequisite for Socially-Aware Language Acquisition
Stefanov, Kalin, Beskow, Jonas, Salvi, Giampiero
This paper presents a self-supervised method for detecting the active speaker in a multi-person spoken interaction scenario. We argue that this capability is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. Our methods are able to detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their faces. Our methods do not rely on external annotations, thus complying with cognitive development. Instead, they use information from the auditory modality to support learning in the visual domain. The methods have been extensively evaluated on a large multi-person face-to-face interaction dataset. The results reach an accuracy of 80% in a multi-speaker setting. We believe this system represents an essential component of any artificial cognitive system or robotic platform engaging in social interaction.